1 TTCGCCTGCG GGCCGGCACT GCTCACCTCT CGTCCAGGGA CATGACGGGC 
51 ACGCCAGGCG CCGTTGCCAC CCGGGATGGC GAGGCCCCCG AGCGCTCCCC 
101 GCCCTGCAGT CCGAGCTACG ACCTCACGGG CAAGGTGATG CTTCTGGGAG 
151 ACACAGGCGT CGGCAAAACA TGTTTCCTGA TCCAATTCAA AGACGGGGCC 
201 TTCCTGTCCG GAACCTTCAT AGCCACCGTC GGCATAGACT TCAGGAACAA 
251 GGTGGTGACT GTGGATGGCG TGAGAGTGAA GCTGCAGATC TGGGACACCG 
301 CTGGGCAGGA ACGGTTCCGA AGCGTCACCC ATGCTTATTA CAGAGATGCT 
351 CAGGCCTTGC TTCTGCTGTA TGACATCACC AACAAATCTT CTTTCGACAA 
401 CATCAGGGCC TGGCTCACTG AGATTCATGA GTATGCCCAG AGGGACGTGG 
451 TGATCATGCT GCTAGGCAAC AAGGCGGATA TGAGCAGCGA AAGAGTGATC 
501 CGTTCCGAAG ACGGAGAGAC CTTGGCCAGG GAGTACGGTG TTCCCTTCCT 
551 GGAGACCAGC GCCAAGACTG GCATGAATGT GGAGTTAGCC TTTCTGGCCA 
601 TCGCCAAGGA ACTGAAATAC CGGGCCGGGC ATCAGGCGGA TGAGCCCAGC 
651 TTCCAGATCC GAGACTATGT AGAGTCCCAG AAGAAGCGCT CCAGCTGCTG 
701 CTCCTTCATG TGAATCCCAG GGGGCAGAGA GGAGGCTCTG GAGGCACACA 
751 GGATGCAGCC TTCCCCCTCC CAGGCCTGGC TTATTCCAAG AGGCTGAGCC 
801 AATGGGGAGA AAGATGGAGG ACTCACTGCA CAGCCGCTTC CTAGCAGGGA 
851 GCTATACTCC AACTCCTACT TGAGTTCCTG CGGTCTCCCC GCATCCACAG 
901 GGAGGGTAAA ACACTTAGCT TTTATTTTAA TAGTACATAA TTTAATACCA 
951 AAAAAGGCGC CTGGATCCCC AAAAAACCGA GGCTGGGAGC TAGTGGCCCT 
1001 TTTGCTTTCT AGGACTTGGG GGGCCGGCCC TCCCTCCTAA GCATAACAAA 



1051 GGTGGTGTTG CTCCAGCTCA GCCCCAGGGG ACACAGATGC ACTTTGGGGG 
1101 TGAGGGCAGG TAATGACTCC ATCGCACCCT CAGTTCAGCT GGACAGAGGC 
1151 TCAGGTGACC CCAGCCTTCA CTGTCTCCCG CTCTCCAGGA GCTTATCTTC 
1201 GCCCCATCTC CCAAATAAGT GGGCCCTTGT GCTGTGAGGA AGACCAAAGC 
1251 CTCAGGGAAG ATAAGAGATA TGGAGATGGG AGGGGGAGGA CAAGGGGCAG 
1301 AGAGTAGGGT CTAGCTGGCT ATCTCTGGCC TTACTAACAC CCCCCTGGAG 
1351 GCATGCCCCT TTTCTCCAGC ACACAAGCAC ATTGGGGCAC CTGGAAATAT 
1401 TGGTTCCAGG CTCCTGTTCT CTGGACTTCA GATCCTGGGG GAGCCCCTCC 
1451 CCCCCCTGAA TCCCTGGCTT AGCTACCTTC CTGCCTGTGC ACCTAAAAAC 
1501 CTCAGGTCAG AACTAGGAAA AGAGTTTTGT TTTTATTTTT TTGAAATGGA 
1551 GTCTCGTTCT GTCGCCCAGG CTGAGGTGCA GTAGTGCAAT CTCCGCTCAC 
1601 TACAACCTCC ACTCCCTGGG GCTCAAGCGA TCCTCCCACC TCAGCCGCCG 
1651 AAGTAGCTGG GACTATAGGT GTGTACCATC ACACCTGGCT AATTTTTGTA 
1701 TTTTTTGTAG ACACAGGGTT TCGCCATGTT GCCCAGGCTG GTCTTGAATT 
1751 CCTGAGCTCA AGCAACCTGC CGGCCTCGGC CTCCCAAAGT ACTGGGATTA 
1801 CACGCAGAAG GCACCATGCC CAGGCTAGAT GTGTCTTATC CCAATCCTTT 
1851 GGCAGGCATG CAGCTCCACA GGCGATTTCT TCAAGCAGCT GAAGTGTTTA 
1901 GCCCTCCTGG GTTAAGAGCC AGATAAGGAG AAATCCCTTT CCTAGGTTTG 
1951 GAATGTGTTG TGAAAAAAAA GAGAAATCCC TGGCTCCTGG AGCTGGTGGG 
2001 AGACAAGATT AAGCAAACCT CCCCTGACAT GTATCCCTTT GACCCCAAGC 
2051 TCTGCCTCCT CCCTGACCAC CCATGCCCTT TCCTTTAACT TCTCAAACAG 
2101 ATACCAGGGC CTAAACTGCT TTACCTCCCC TCCTACTGAG TCAGGTTAGG 
2151 TGGTGGGAGG TCACCCATTT CCGAGTTAAA CCAATGCAAT ATGAGTAAAA 
2201 CAAAGTCATG TGGGTATGTC TGGGGTAGAG AGAGGGGTAG CAAGTTCATG 
2251 TGTCCTCCTT GGTCACATAT CTCCCAAAGC TCTGATCCCT GCCATGGGAA 
2301 GTGGACAGGA AACATGAGGT CATGACCTGC AGGCATCTTT ACTGCAGCTC 
2351 TGCCGGCCTG GAGGGGGAGA GGGGGAGGAA GAAGTATGCG CTGCACATTT 
2401 CTGAGGCTAC TGCATTTGCT TTCAAGGCAG AAATCTTGCT CTGAGCAGTC 
2451 AGCGGCTCCA GTTTGGGCCC GATAAGGAAG TTCTCCGTGG CCTCCCTCAG 
2501 GCAGAGCAGG GAGGAGGCTG ACATTGCCAG TCTCTTCTGG GGCCCAAGGC 
2551 AGGTTGCAGG AGATCCAATC CCATAGACAG CTCTGGGCCT CTTGCATTTG 
2601 AGTTTTTCAG AATTAAACTG CAGTATTTTG GAAAGCAAAA AAAAAAAAAA 
2651 AAAAAAAAAA AAAAAAAAAA AAAA (SEQ ID NO:1) 



FEATURES: 

5'UTR: 1-41 

Start Codon: 42 

Stop Codon: 711 

3'UTR: 714 
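
The coordinates above imply a coding region running from nucleotide 42 to nucleotide 710 (669 nt, or 223 codons), with the stop codon at 711-713. The following is a minimal sketch, assuming the standard genetic code and 1-based coordinates, of how a cDNA can be translated from its annotated start codon. The names `CODON_TABLE` and `translate_from` are illustrative only, and the test string is just the first 80 nucleotides of SEQ ID NO:1 as printed above.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(c): aa
               for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate_from(seq, start):
    """Translate seq from a 1-based start position up to the first stop codon.

    Whitespace in seq is ignored, so the spaced ten-base groups of the
    listing can be pasted directly. Unknown codons are reported as 'X'.
    """
    seq = "".join(seq.split()).upper()
    protein = []
    for i in range(start - 1, len(seq) - 2, 3):
        aa = CODON_TABLE.get(seq[i:i + 3], "X")
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# First 80 nucleotides of SEQ ID NO:1 (positions 1-80).
head = ("TTCGCCTGCG GGCCGGCACT GCTCACCTCT CGTCCAGGGA CATGACGGGC "
        "ACGCCAGGCG CCGTTGCCAC CCGGGATGGC")

print(translate_from(head, 42))   # N-terminal residues encoded from position 42
print((711 - 42) // 3)            # codon count implied by the start/stop coordinates
```

Run on the fragment, the sketch yields the first 13 residues of the protein, and the codon count from the coordinates (223) agrees with the length reported for the top BLAST hit alignment.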
Homologous proteins: 

Top 10 BLAST Hits 

Hit                                                                    Score        E
CRA|103000001517087 /altid=gi|10946770 /def=ref|NP_067386.1| RA.       425          e-117
CRA|1000682330460 /altid=gi|7657492 /def=ref|NP_055168.1| RAB26.       [illegible]  [illegible]
CRA|18000004977238 /altid=gi|1710022 /def=sp|P51156|RB26_RAT RA.       [illegible]  [illegible]
CRA|18000005013109 /altid=gi|1083775 /def=pir||JC2528 GTP-bindi.       293          7e-78
CRA|89000000198627 /altid=gi|7296421 /def=gb|AAF51708.1| (AE003.       [illegible]  [illegible]
CRA|18000005076419 /altid=gi|7438397 /def=pir||T15123 hypotheti.       207          4e-52
CRA|18000004912300 /altid=gi|134236 /def=sp|P20791|SAS2_DICDI G.       [illegible]  [illegible]
CRA|98000043536338 /altid=gi|12963499 /def=ref|NP_075615.1| cel.       [illegible]  [illegible]
CRA|18000004929618 /altid=gi|131798 /def=sp|P24407|RAB8_HUMAN R.       202          1e-50
CRA|18000004952869 /altid=gi|131848 /def=sp|P22128|RAB8_DISOM R.       202          2e-50
CRA|18000005221564 /altid=gi|4586580 /def=dbj|BAA76422.1| (AB02.       202          2e-50

BLAST dbEST hits: 

Hit                                        Score        E
gi|13033710 /dataset=dbest /taxon=960.     1318         0.0
gi|12785775 /dataset=dbest /taxon=960.     1316         0.0
gi|12904236 [remainder illegible]          [illegible]  [illegible]
gi|9093496 /dataset=dbest /taxon=9606.     694          0.0



EXPRESSION INFORMATION FOR MODULATORY USE: 

library source: 

From BLAST dbEST hits: 

gi|13033710 prostate 

gi|12785775 brain 

gi|12904236 T cells from T cell leukemia 
gi|9093496 leukopheresis 



From tissue screening panels: 
leukocyte 
1 MTGTPGAVAT RDGEAPERSP PCSPSYDLTG KVMLLGDTGV GKTCFLIQFK 
51 DGAFLSGTFI ATVGIDFRNK VVTVDGVRVK LQIWDTAGQE RFRSVTHAYY 
101 RDAQALLLLY DITNKSSFDN IRAWLTEIHE YAQRDVVIML LGNKADMSSE 
151 RVIRSEDGET LAREYGVPFL ETSAKTGMNV ELAFLAIAKE LKYRAGHQAD 
201 EPSFQIRDYV ESQKKRSSCC SFM (SEQ ID NO: 2) 



FEATURES: 

Functional domains and key regions: 

[1] PDOC00001 PS00001 ASN_GLYCOSYLATION 
N-glycosylation site 




114-117 NKSS (SEQ ID NO: 5) 



[2] PDOC00004 PS00004 CAMP_PHOSPHO_SITE 

cAMP- and cGMP- dependent protein kinase phosphorylation site 



Number of matches: 2 

1 214-217 KKRS (SEQ ID NO:6) 

2 215-218 KRSS (SEQ ID NO: 7) 

[3] PDOC00005 PS00005 PKC_PHOSPHO_SITE 
Protein kinase C phosphorylation site 



Number of matches: 5 

1 29-31 TGK 

2 113-115 TNK 

3 149-151 SER 

4 173-175 SAK 

5 212-214 SQK 



[4] PDOC00006 PS00006 CK2_PHOSPHO_SITE 
Casein kinase II phosphorylation site 



116-119 SSFD (SEQ ID NO: 8) 



[5] PDOC00008 PS00008 MYRISTYL 
N-myristoylation site 



Number of matches: 5 
3-8 GTPGAV (SEQ 
GAFLSG 

[6] PDOC00017 PS00017 ATP_GTP_A 
ATP/GTP-binding site motif A (P-loop) 



36-43 GDTGVGKT (SEQ ID NO: 14) 



[7] PDOC00579 PS00675 SIGMA54_INTERACT_1 

Sigma-54 interaction domain ATP-binding region A signature 



32-45 VMLLGDTGVGKTCF (SEQ ID NO: 15) 
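
The short PROSITE site entries above can be reproduced with a simple pattern scan. The following is a minimal sketch, assuming the published PROSITE consensus patterns for PS00001, PS00004, PS00005, PS00006 and PS00017 rewritten as regular expressions. The names `PROTEIN`, `PATTERNS` and `scan` are illustrative, and the sequence is SEQ ID NO:2 with residues 71-72 and 136-137 read as VV, which is what the cDNA of SEQ ID NO:1 encodes. A lookahead is used so that overlapping matches (such as the two cAMP sites) are all reported.

```python
import re

# SEQ ID NO:2 in ten-residue groups (223 residues).
PROTEIN = (
    "MTGTPGAVAT" "RDGEAPERSP" "PCSPSYDLTG" "KVMLLGDTGV" "GKTCFLIQFK"
    "DGAFLSGTFI" "ATVGIDFRNK" "VVTVDGVRVK" "LQIWDTAGQE" "RFRSVTHAYY"
    "RDAQALLLLY" "DITNKSSFDN" "IRAWLTEIHE" "YAQRDVVIML" "LGNKADMSSE"
    "RVIRSEDGET" "LAREYGVPFL" "ETSAKTGMNV" "ELAFLAIAKE" "LKYRAGHQAD"
    "EPSFQIRDYV" "ESQKKRSSCC" "SFM"
)

# PROSITE consensus patterns expressed as regular expressions.
PATTERNS = {
    "ASN_GLYCOSYLATION": r"N[^P][ST][^P]",   # N-{P}-[ST]-{P}
    "CAMP_PHOSPHO_SITE": r"[RK]{2}.[ST]",    # [RK](2)-x-[ST]
    "PKC_PHOSPHO_SITE": r"[ST].[RK]",        # [ST]-x-[RK]
    "CK2_PHOSPHO_SITE": r"[ST].{2}[DE]",     # [ST]-x(2)-[DE]
    "ATP_GTP_A": r"[AG].{4}GK[ST]",          # [AG]-x(4)-G-K-[ST]
}


def scan(seq, regex):
    """Return (begin, end, match) tuples with 1-based inclusive coordinates."""
    return [(m.start() + 1, m.start() + len(m.group(1)), m.group(1))
            for m in re.finditer(f"(?=({regex}))", seq)]


for name, regex in PATTERNS.items():
    for begin, end, hit in scan(PROTEIN, regex):
        print(f"{name}\t{begin}-{end}\t{hit}")
```

The coordinates printed by this sketch can be compared line by line with entries [1] to [4] and [6] above.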



Membrane spanning structure and domains: 

Helix  Begin  End  Score  Certainty 
1      48     68   0.715  Putative 
BLAST Alignment to Top Hit: 

>CRA|103000001517087 /altid=gi|10946770 /def=ref|NP_067386.1| RAB37, 
member of RAS oncogene family; GTPase Rab37 [Mus 
musculus] /org=Mus musculus /taxon=10090 /dataset=nraa 
/length=223 
Length = 223 

Score = 425 bits (1081), Expect = e-117 
Identities = 209/223 (93%), Positives = 215/223 (95%) 
Frame = +3 

Query: 42  MTGTPGAVATRDGEAPERSPPCSPSYDLTGKVMLLGDTGVGKTCFLIQFKDGAFLSGTFI 221 
           MTGTPGA    DGEAPERSPP SP+YDLTGKVMLLGD+GVGKTCFLIQFKDGAFLSGTFI 
Sbjct: 1   MTGTPGAATAGDGEAPERSPPFSPNYDLTGKVMLLGDSGVGKTCFLIQFKDGAFLSGTFI 60 

Query: 222 ATVGIDFRNKVVTVDGVRVKLQIWDTAGQERFRSVTHAYYRDAQALLLLYDITNKSSFDN 401 
           ATVGIDFRNKVVTVDG RVKLQIWDTAGQERFRSVTHAYYRDAQALLLLYDITN+SSFDN 
Sbjct: 61  ATVGIDFRNKVVTVDGARVKLQIWDTAGQERFRSVTHAYYRDAQALLLLYDITNQSSFDN 120 

Query: 402 IRAWLTEIHEYAQRDVVIMLLGNKADMSSERVIRSEDGETLAREYGVPFLETSAKTGMNV 581 
           IRAWLTEIHEYAQRDVVIMLLGNKAD+SSERVIRSEDGETLAREYGVPF+ETSAKTGMNV 
Sbjct: 121 IRAWLTEIHEYAQRDVVIMLLGNKADVSSERVIRSEDGETLAREYGVPFMETSAKTGMNV 180 

Query: 582 ELAFLAIAKELKYRAGHQADEPSFQIRDYVESQKKRSSCCSFM 710 (SEQ ID NO:2) 
           ELAFLAIAKELKYRAG Q DEPSFQIRDYVESQKKRSSCCSF+ 
Sbjct: 181 ELAFLAIAKELKYRAGRQPDEPSFQIRDYVESQKKRSSCCSFV 223 (SEQ ID NO: 4) 
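
The identity figure reported for the alignment is a count of positions at which query and subject carry the same residue. The following is a minimal sketch of that count for an ungapped alignment block, using the last block above (query residues translated from nucleotides 582-710, subject residues 181-223). The function name `identities` is illustrative, and the query string reads the residues after SQ as KKRSS, which is what SEQ ID NO:2 shows.

```python
def identities(query, sbjct):
    """Count identical positions in two equal-length, ungapped aligned strings.

    Returns (number of identities, aligned length).
    """
    if len(query) != len(sbjct):
        raise ValueError("aligned strings must have the same length")
    matches = sum(q == s for q, s in zip(query, sbjct))
    return matches, len(query)


# Final alignment block: human query versus mouse RAB37 subject.
query = "ELAFLAIAKELKYRAGHQADEPSFQIRDYVESQKKRSSCCSFM"
sbjct = "ELAFLAIAKELKYRAGRQPDEPSFQIRDYVESQKKRSSCCSFV"

same, length = identities(query, sbjct)
print(f"{same}/{length} ({100 * same // length}%)")
```

Summing the per-block counts in this way over all four blocks gives the overall identity total for the 223 aligned positions. Counting positives would additionally need a substitution matrix, which this sketch does not attempt.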



Hmmer search results (Pfam) 

Model     Description                              Score   E-value  N 
PF00071   Ras family                               306.9   8.4e-90 
CE00060   CE00060 rab_ras_like                     213.3   3.7e-60 
PF01142   Uncharacterized protein family UPF0024     2.6   3.4 

Parsed for domains: 
Model     Domain  seq-f seq-t    hmm-f hmm-t      score  E-value 
CE00060   1/1        31   191 ..    25   193 ..   213.3  3.7e-60 
PF01142   1/1       185   201 ..   444   462 .]     2.6  3.4 
PF00071   1/1        31   223 .]     1   198 []   306.9  8.4e-90 
1 AGGGGAGAGA AAAGACCGCA TACCAGGCCA GGTGCGGTGG CTCACGCTTG 
51 TAATCCCAGC AATTTGGAAG GCCAAGGCAG GCGTATCGCC TGAGGTCAGC 
101 AGTTCCAAAC CAGCCTGTCC AACATGGTGA AGTTCTCTAC TAAGAATACA 
151 AAAATTACCC AGGCGTGGTG GCGTGCACCT GTAGTCCCAG CTGCTCCAGA 
201 GGCTGAGGCA GGAGAATTGC TTGAACCTGG GAGGCAGAGG CTGCAATGCG 
251 CCAAGATCCC GCCACTGCAC TCCAGCCTGG GCGACAGAGT GAGACTCCGT 
301 CTCCGGGAGC CCACGGCATT GAGCAAACCT CGGCATTATT TGCAGCAAGA 
351 GCCTCTGGCA TCCAAATAGC AACCAACACC ACGCCTCTGT AGTGTGCTGC 
401 GCAGCCTCCA CACTCCAGTC TGAGGCTCCC TGTTTGAGTC CCGCCCTATG 
451 CCCAGCTGAG GTTATAGCAC GCTCACCTCC AGAAGAGGTA ACCCAAGCTC 
501 TTTACTCTAC TGGAGATCAC CTCTGTCCCC ACTCTGGGCG CTTCTCCCAG 
551 CTGACAGAAA ATACCTCCAG CTGATGTCAG AAAATACAGG GCTGGAGGCT 
601 GGCGTACAAA GTCAGTCCCC ACAGGCCTAT GGTGGCCCAT AAGCCACGTC 
651 TACCCCTGCT CCTCACCTCC ACACCTAAGT TAAGAATTGC AGGCCGGGCG 
701 CAGTGGCTCA CGCCTGTAAT CCCAGCACTT TGGGAGGCTG AGGTGGGCGG 
751 ACCGCCTGAG GTCAGGAATT TGAGACCAGC TTGGCCAACA TGGCAAAACC 
801 CCGTCTCTAC TAAAAATACA AAAAGAAAAA ATAGCCGGGC CTGATGTCGC 
851 GCACCTGTAA TCCCAGCTAC TCCGGGAGAC TGAGGCGGGA GTATAGCTTG 
901 AACCCGGGAA GCAAAGGTTG CAGTGAGGCG AGATCGCACC ACTGCACTCC 
951 AGGCTGGGCG ACAGAGTGAG ACTCTGTCTG AAAAAAAAAA AAAGTGCAGG 
1001 TACCCCTCTC CAGCTCTCCC CTCCCTACAC ATCCCTCAAA CCGTCCCGCT 
1051 GTAATGCACC CGCCCTGTTC CTTGGTAACT TGAAGCTGCT TATAGAATGT 
1101 GGAGATGGGG GTAATTGAAA GGTCGGCCCA GGCCACAGAG CCCCTGAGCT 
1151 CTGCTACCGG CAACCCCAGC TGCACTCCCC ACTCTCTGTC ACCAGGAGCT 
1201 GCCGGGTGCC TGGGATATCC TGGCAGCTCT GCTCAAAATG ATCTACGACT 
1251 TCATGAATTT ATTTGGCTCC TCCTCGGGGC CAGGGTGAGT GTCATGGGTT 
1301 AATAAGGCCG GCCCCGCCTT CAGGAGCGGT CCACTGGGAG ATGTGTGCTG 
1351 CGCAGCCCTC TTGCGAAAGC TCTCCCCTGG TGGGACATTC TGGGCACAAC 
1401 CAACAGGCCG GGGGAAATGA GAGGTGATCC ATACTAAAGG GTCAAAGTCC 
1451 CCGCACCAGG CAGAGGCCCC AAAACACCGC AGCGTACATG TGCTGCAAGG 
1501 CGAGTACGGG TTGGTAAACA AAACTATATT CAGATGAGCT CGGGCCGGGT 
1551 GACTTAACAG ATGAGGAAGT GTCTCGGGGC CATCGGCGGA GGCGCAGCCC 
1601 AGGGGTCCCC AGCTCCCCGC CTCGCCACCT GGGGACAGCC CACGGCCCGG 
1651 GGCTCGGGCG CCGCCTGCTG TCGCGGTGCG CAGCGACTAC GGGAACTCTT 
1701 CCGCAGCAGA CGGGGTCCCC GCGGCCCGCT CCCCCAGGGG CAAGCAAGCG 
1751 ACCACAGGGG ACCGGTCCCG GGGCTGGATG TGGCTCATGT CCGAAGCGCA 
1801 CGGAGCCGAG CCGGTGTTGC TCAGGGAGGC TGCCCGCCCC TTCACGCAGA 
1851 CCCTGCGGCT CTGCGTGCCC TCAGGGAACA GCAAGGTCCG AGCCGGTGTC 
1901 GTCGAGGGGG CGACGGGACG GAGGGAGGAG CCTGAGGGGT CCCGGTCGAG 
1951 GGAGGGGAGG AGTGGGCGGG GCGGGGGTGG GGGCCGTTCC CGCGCTCTCC 
2001 TTCGCCTGCG GGCCGGCACT GCTCACCTCT CGTCCAGGGA CATGACGGGC 
2051 ACGCCAGGCG CCGTTGCCAC CCGGGATGGC GAGGCCCCCG AGCGCTCCCC 
2101 GCCCTGCAGT CCGAGCTACG ACCTCACGGG CAAGGTGGGT GGGCCTCTTC 
2151 CGTGAGACCC CCGCCCTCCT CGGCGCTAGC CCCTTCCTGG CTGCGTCTGG 
2201 GTTGGACTCA GCCCTTCCCC CAGGCAGCTG CGTCTCCCAG AGGAGGGAGG 
2251 GAGAGAGGGT CAGGACACAG CCTCTGGGGC CGTCCCAAGC TCTAGGTGTC 
2301 TCTGCTGGCT TGGTGGGGGC GGGTCGCGGA AGATCGCAAA AACTGAGTGA 
2351 TCCCCCCGCC GGCCCCAACT CAGTTCTCTT CTGCCACACT CTGGCAAATA 
2401 TGAGCCCCCG GGAGCCCATG CTTCTTGGTG AGGGTTAAGC GCGCAACTCT 
2451 CGGGGCTCAG GCTGGGAAGG GCTGGGAGAT GGGGACCGAA CGGAGACTCG 
2501 GAGAGGACGT CCCCTGCTGG CAGAGGAACT GGCGTTAATG CCATTTTCCG 
2551 AGCTAAGCTC TTAGTTGAGA TCTGACATCC AGGTTTAAGG CCTGATGTCC 
2601 CCCAGCTGCT CCCCTCCCAT TCCACCCGCT GGAGGCACTG CCTCCCACCT 
2651 TCCTCCCTGC AGTCGGAAGC CGCTCCTCCC AGAAGGATGT TGCCAGCCGG 
2701 CCTGCAGGTC ACTTGGGAAT TTTTCGAACC TGAGAAAGAT TTCAGTGGTT 
2751 GGTCTTTCGC ATCCCGCACT TGAGAGAGCT CCAGGGCTGC TCTCTGGGGC 
2801 TTGCTCCCTC TACAGGGGTG TCCTGTATGG AAACAGGTAG GGACAGCAGT 
2851 GGACTGGTCT GTCGCCTTCC ATCTGTGTCC TTGGAGTGAG CGGGTACCAG 
2901 AAACTGAAAG AACTGCTGAG GGAGCCTAGA GCTTCCACTC TTCCTCTGCA 
2951 GGGTTGGGGA TGGAGTGAGG GCTGTCCTGG ATTCCGCTGC ATGGCCTTGA 
3001 AGGAGACCTG CCTCTCTCTG GGCCTCGGTT TCCTCCCCGA CACCAGGGCT 
3051 CACCCTTGCT GGGAGCCTCA GCCTCCACCC CAGTGTTTCG GGGGAAGCCA 
3101 CCCTGCAAGT CATCCGCCCA GAGCCGTTGA GATAGGCGTC CTGTGTGGGC 
3151 TTGTGGCAGG AAATGGGCCC CTGCACCCTC GGAGAGGAGG AGCTGCTGTT 
3201 GGCCAGGCCC CAGGCTGAGG GGGACTGCCT GACCTTGTTG CCCTGCAAAC 
3251 CAGCTGGGTT GTTTGCCTAG GAGGTGGCCA GGCTAGGCAG CTGTTTGTGT 
3301 TTGGTGGAAT CACCGAGCTG GGTGGGTAGC TGGCATCGTT TGCTCAAGGC 
3351 AGCTGTGATC TGTAAAGTAC ACAAAGACTG GCCCTCCCTC CCTCCTTCCT 
3401 GCTCCAGGGC TGGGACCCAG GAGCCAGGGA GGAGTGCAGG CTCCAGAAAG 
3451 CTCCTATCCC CCACCCCTTC ATCTGTTCCC TGGCCAAGCG GCATTGGCCG 
3501 GAGAGTTGGT CCCCAGCCTC CCCGGGCCTG CCCCAGGGGA GTGAGTCCAG 
3551 GACCCTCTGA GAAAGCCTGG CAGGAGCTCC TTGGACCAGA CTAGGGGTGA 
3601 TGTGGCCCAC AGGCAGACAG TTCCCACCCT GGGCCACTCT TCCCTGGGTC 
3651 TTAGGTGATT CACCACGATG ATGGGCCCTA GCCATTAACA GACTCTAGAA 
3701 ATACCTCAAA GACATTATCC CTCCTCCTTC TACCCACTAT GGAAACCATG 
3751 CCACAGAAAG GTTAAGGAAT CTTCCTAAAG TCACACAGTA GGCCATTTAC 
3801 AAATCAAGAC CCATCCTTCA TACCCCTTCT GCTCAGCCAC CCCTGCCTCT 
3851 CCACCAGAGT TAACTAATGC CAGTACCCCA TGCCCACAAC AGGAATGCCT 
3901 TTGGGCTCCA CTGTCAATTT CAGAGCCTCA AAAATAATTC AAACCTAGTC 
3951 CCTGCTTAAC CCATTAAGCC ACCTAACCAG CAGCTGGGAA ATTCCAGCAT 
4001 TGGATCTAGA CCCCTGTTAT CCAAGATTGG AGAACAGTGG GACAAAGTGC 
4051 TCCTCTCCAC CATTCCTGCG TGTCCCTGGG GAAGATGAGC AGAGCAGAGC 
4101 CAGACAGTAA AGGAGAGGGC CACGCCCCCT CCACAGGTTA CCTCCTTGGT 



4151 ACTCCTGCCC GCACTACCCA CAGCAACCCC GGGATGCCGA TCTGCAGCCA 
4201 CATGTCCCAT GTGGGAGGTT TCTGCTGAAA GAACTTCCAA CTACACATCT 
4251 CCCCACTTCA GTATAAATTT CAACCTTCCC TAATTCATGC AACCTTTTTT 
4301 TTTTTTTTTT TTTTTTGAGA CAGAGTGTCG CTCTGTCACC GAGGCTGGAG 
4351 TTCAGTGATG CAATCTCGGC TCACTGCAAC CTCTACCTCC TGGGTTCAAG 
4401 CTATTCTCCT GTCTCCGCCT CCCAAGTAAC TGGGACTACA GGCGTGTGCC 
4451 ACCACTCCTG GCTAGTTTTT TGTATTTTTA GTAGAGATGG GGTTTCACCT 
4501 TGTTGGTCAG GCTGGTCTCA AACTCCCAAC TCAGGTGATC CGTCCACTTG 
4551 GGCACCCAAA ATGNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN 
4601 NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN 
4651 NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN 
4701 NNNNNNNNNN NNNNNNNNNN TTCAAGTACC AGCCTGGCCA ACATGGTAGA 
4751 AACCCCGTCT CTACTAAAAA TAAAAAATTA GCCAGGCGAG GTGGTGCATG 
4801 CCTATAATCC CAGCTACTCA GGTAGGCTGA GGCAGGAGAA TCATTTAAAC 
4851 CTGGGAGGTG GAGGTTGTGG TGAGCCAAGA TCTCGCCATT GCACTCCAGC 
4901 CTGGGCAACA AGAGCAAAAC TCCGTCTCAA AAAAAAAAAG AAAGAAAGAA 
4951 AGAAAGAAAC TTCCAAATAA ATGTTGTGAC ACAAAAAAAA AAACCCAAAC 
5001 AATATTCATT ATAGAGTATG CAAATGACCA TGCCCCACCC CCAGCAGATT 
5051 CTGATAGACT CCCTTGGGTG GGAATCCTTG TCCAATATAT TGACACTTCC 
5101 CTTTCCTGTC AGTATAGCCC AGCCCATGCG TGTACTCACG AGCGGACGAT 
5151 GGATGACACA AGTACACAGA GGGACGGAAT CCCTGCATGG TGTGGCTATG 
5201 GGCAAATGTG GCCACTGTCT AGATTGTGCA AATGTGGTGG TTCTCTGGGG 
5251 CCACAGAGCA CACTTGGGGA CCTGTTCATG GTGAGGTCTC AACTCCGGCC 
5301 TCTAGGAACT TGAATGAGGA CAGGAGGGTC AGAGGGAGAG CCTAGGAGGC 
5351 TGAGCCAAGG AGCGTGGAGA GGAGAGACAG GGTGAAGGTG GCGGCTGGCT 
5401 TTCTGGAAGC AGGTGGCCTT TGGTGCGGTC AGCATTCGTG CCAGCCCCCT 
5451 CTTCTCTGAT CCTCTCCATG TGTCTCTCTC CTGGAATCCC AGAAGCTGCC 
5501 CCTGACTCCC CATTAACTGC CTCTGCCCCT ACCCCCTAGG TGATGCTTCT 
5551 GGGAGACACA GGCGTCGGCA AAACATGTTT CCTGATCCAA TTCAAAGACG 
5601 GGGCCTTCCT GTCCGGAACC TTCATAGCCA CCGTCGGCAT AGACTTCAGG 
5651 GTGAGGTGGC TGCAGGCACT TGCTTCCAGC AGAGAGCCAG GGCTGTGGCT 
5701 CAGGCATGGG GGGGTTGCCC CCACCTTGCT CACCCTGGCT CCCAGGGACT 
5751 CCCGAGGCTC ATGCCTGGAG GGCACACAAC CCGCTCCCCC AAGACCACAG 
5801 AGGTGGCCGG GTCAAAGGAG ACTGGGCAAG GTTGGCTCCT TGCCCAACTA 
5851 TAGGATGCAA AAAAATGAGA CTGAGTCTTC GATTCCAGCT CCATTCCTGG 
5901 GGGACTTCTC CCAAGCAGAG CAGCCGCAGG CACGGCATAA GCTGAATATC 
5951 TTGGCCCACA GAGCCCCTGC TCATTGCTCT CCTACCTGGG CCCCTTTGGA 
6001 AAGGCCTCAA AGGTCAATCA GTCTTTCTGG AGTTCCCAGA AAGCACAGCC 
6051 CTGCACTGGG TTTAAGAGCT GGGCTTGGGC CAGGCATGGT GGCTCTTGCC 
6101 TGTATTCCCA GCACTTTGGG AGGCCGAAGC GGTCAGATCA CAAGGTCAGG 
6151 AGTTTGAGAC CAGCCTGGCC AACATGGTGA AACCCCGTCT CTACTAAAAA 
6201 TACAAAAATT AGCCAGGTGT AGTGGCACGC TCCTGCAGTC CCAGCTACTC 
6251 GGGAGGCTGA GGCAGGAGAA TCGCTCAAAT CCGGGTGGTG GAGGTTGCAG 
6301 TGAGCTGAGA TCGCGCCACT GCACTCCAGC CTGGGCAACA AAGTGAGACT 
6351 GCGTCTCAGA AAAAAAAAAA AAAAAAGAGC TGGGCTGGCC ATGTTGGGAG 
6401 ACAGCAGCTC ACCAGGGACC CTCCCTCTCA CCTTGACGAC TCCATCTTAC 

6451 AAATCTGCAT CAGGGATGCT AGACGCTGCA CACCTGAAGT GTTCAATAGA 
6501 GAAAAGGTCT CACCCTGGCA GGTGGGGCTC TACAGCTTCA AGCAGGCAGA 
6551 AAGCGAACAC TTCCTTCACT AGAGAATTAG TGGGCAGCTA AAGAAAAGGT 

6601 GCTGCTGCAG ATGTAGCCTC AGGTCCCCAG GATGCAGGCA AACACCCCAT 

6651 CTCCAGGGGC TCGGTCACAG TCCCAAGGCT AGGCTCCAGG AGAGGGAGAC 

6701 CGAAGTGGGG AAAGGGCAGG GCCTCCAGCA GCAACCAGCC CTCCAGCCCT 

6751 GGGCTGCCTG ATCCCTGGAG AGAGCCAGGA TGTTTCTCAG GCTCCTCTTG 

6801 CCCTGCTGTT GTGAGAAGGC AGTTACAGTC CTCAGAAGGG ACGACTCCAC 

6851 AGTGGAGGTG TCTGGGTATG GGGTTCCTGC TGCCCTGATG GTATGATCTG 

6901 GCTGGAGACG GTTCTGGGGC TCACTGCACC CACTCTAGGC CTGGAGAGGG 

6951 AACAAGAGAG GACGTCTGCA GAGCTGAGGA GCCACATGAC TCCTGCCCTC 

7001 CCATCCTCTG CCTTTTTCTC TTTCAGAACA AGGTGGTGAC TGTGGATGGC 

7051 GTGAGAGTGA AGCTGCAGGT GAGACCAGAG GCTGGAGTTG GGGAGGGAGG 

7101 ATGGAGGACC TGCCCTTCCT TCTCACCCTG AACCACAGGA GGCCTGCAGC 

7151 CCTGCCCTCC GCCTGGGGCA ATTTCCTGTG GGGCCCACGG GAGGAAATGG 

7201 CTTTTGTTTA TTTGACATCT GCAGAAAAAG CAGTTCCCAG GCACCCTCTC 

7251 ATCTATGAAC AGCAGCTCCA AATGCCTTCA GACAAGCTTA GCCTCCATCC 

7301 ATCTCCTCCC CAGTTGCCAG GGCTTTATCT GCTCTTAGGA GATTGGACAT 

7351 CCCCAACCCC TGAGCTAGGG GAGAGGAGAA GATTCTTTTT TTTTCTTTTC 

7401 TTTTCTTTTT TTTTTTGAGA TGGAGTCTCG CTCTGTCGCC CAGGCTGGAG 

7451 TGCAGTGGCA CAATCTCGGC TCACTGCAAC CTCTGCCTCC CAGGTTTAAG 

7501 AGATTCTCCT GCCTCAGCCT CCTGAGTAGC TGAGACTACA GGTGCATGCC 

7551 ACCACACCTG GCTAATTTTT TGTATTTTTA GTAGAGACGG GGTTTCACTG 

7601 TGTTAGCCAG GATGGTCTGG ATCTCCTGAC CTCGTGATCC GCCTGCCTCG 

7651 GCCTCCCAAA GTGCTGGGAT TACAGGTGTA AGCCACCGCG CTCGGCTGAG 

7701 GAGATGATTT TGAACGAGCT TGAGAAATCA GTAACTGCTA CTGTCCAGGT 

7751 CATTGGATGC TCAGGGGCTC ATGAGAACCT AAAGAAGAAA ACAGCCCCAC 

7801 CTTCCCACAG ATATCTCATA CAACAAAGCA GGCCTGCTCC ACCCAGCACA 

7851 TTCCTTGCAC CTGCCTCCTT CTGACCATTT CTCCATCCCA TCCCTTCCCA 

7901 GATCTGGGAC ACCGCTGGGC AGGAACGGTT CCGAAGCGTC ACCCATGCTT 

7951 ATTACAGAGA TGCTCAGGGT GAGTCCCTCG CACCCTCCAA CCCCTACCCC 

8001 AGCCCCTTGG TAGCATCCGT GCTGCTGCCT AAGTCCCCTC TGTGATCCTC 

8051 TCCCCTCCAG CCTTGCTTCT GCTGTATGAC ATCACCAACA AATCTTCTTT 

8101 CGACAACATC AGGGTAGGTC CTCCCTTCCC CTGACTCCCA CCCATAAGCA 

8151 GCCAAGGCAA GGTCTATGCA GGCTGGGGTT GCTTCCTGCC CTGTGGAAAG 

8201 CGGGTGGAGC GTGGAGTCCT CCTGCCTTCT GAAAAACACC TACTTGTGAC 

8251 TCAGAAGTCA TATCTGCTGC TTTGTATTTG GTGGCCATGT GGGCATGAAG 

8301 GCCAAGCAGG CTGTTGTGAC CCTGTGCCAC CTGCATAGCC CTCACTGTGA 

8351 TTCACGAGTG TGTTTCGTGA CAAAGTGTTC AGAACAGCCC CCACTCCACC 

8401 CTGGATAATT ATCCACAGAG ACCAAGGGAA AAACACAACC AGAAAAGTCC 

8451 ACACATACAT CCAGGGCAAG TTGCAAGAAA GTGACTCAGT CAGACAGAGT 

8501 GAGTGGTTGT ATCCTCACAA CCAAACTATT ATAGAGACAA AAATTTGATA 

8551 AATTCAAGCA CCAATTTTGT TCACGACATT GTATAGGTTT CATGAATCCC 

8601 CTGACCTCAA GGACAGTTTG CTGATAAGCA AACTAGGAGA ATAAAACGTT 

8651 TATATAGAAA GAGGAAAATC CATGGCACTC ATACTCCTAC CTCCAACCCC 

8701 ATGCTCATGG CAGACATCAC TAATCAATCA CAGTACTTTT GATCACTGAA 

8751 ACCCTTATGT GGTCTTAGAA TCTTTAACAG GACACTCCAA GAAATCACTG 

8801 CTGACAGCCA ACTGATTTGT GAGATAAGGT CTCCATGCAT CTGGATCTTC 

8851 CATAGAACTG ATAGTTGCAC AGCATAAAAT GGTGAGGGTG GGGCCATTGT 

8901 GGGTTGAGCC ACCAAGGAAG GCCATCCAGG CCTGGATGGG CCAGAACAAA 

8951 GGTACAGATG AGAGAACGCA CAGGGTATCG TGTTCAAGGT AGTGAGTAAC 

9001 TGAGGATAGT CAAACGGAGC AGAAGAAGAA AGGGGCAGCA GGAGGAAGAG 

9051 AATGCCAGTC TCGCACGCCC TCTCCCACAG GCCTGGCTCA CTGAGATTCA 

9101 TGAGTATGCC CAGAGGGACG TGGTGATCAT GCTGCTAGGC AACAAGGTGA 

9151 GTGGCTCCGG GGCAGGGTCA GCCCAGCCCT GCACTTCCTC AGCCCTAGCC 

9201 GGCCCCATAA CCACCCAAGA ACAGTTATCT AGGCATCCTT CCTGAAAAGG 

9251 ACTCTGCAGC CTCCAGCTCA GGGGTCAGAC ATATCTGGAG GCTTCTGCCC 
9301 ATCCCATCTG CCCCTTCCAG GGAAAGTCCA AGTTGTTGCC TGAGAAATCA 
9351 AGGGGTGCCC AGTTCTCAGC CCCCATTAGA GCAGAGTGAA CAGGGTCCCA 
9401 GGTCAGGGGC TAAGAGTGCA AAGGGTTAGC CCCAACTGCT GTCCTATTCC 
9451 AAGACCCTTT ACCAAAGGTG AGATCCCAGA GCTGGGAGCT ACACTGGGCA 
9501 GAAACCCTGG CCCCAGGCCA ATCACACCTG CCTGCAGTCC CTTGGGCCAC 
9551 CAGCAGAGGG CAGGCAACGC CTGCTTCTGG GGCAAAATAT GGGCCCGCTG 
9601 GGGCGGAGGC CTCCTTCCCC AGAGTGACCC ATTTGGGCTT GACAGGCGGA 
9651 TATGAGCAGC GAAAGAGTGA TCCGTTCCGA AGACGGAGAG ACCTTGGCCA 
9701 GGGTAAGTGA TTGTCTGTGG GACAGGGTGA AGGGTGGGGG CAACCCGACG 
9751 CTGGCCCTGA GGACACTCTC TCCCGGGCAG GAGTACGGTG TTCCCTTCCT 
9801 GGAGACCAGC GCCAAGACTG GCATGAATGT GGAGTTAGCC TTTCTGGCCA 
9851 TCGCCAAGTG AGAGCTGGGC AGGGAAGGGA AGTGTGCGGG GCAGGGCGGC 
9901 ACACTCCAGG AATCCAGTAG GGCCCGGCCC CTGGCCCAGC CCCTGGACAC 
9951 ACCTGCATTC TGCAGGCTGA GGTCCATTTG CTCTGGGAGC ACTGGGCCAC 
10001 TGGGAGAGGG GAGGGGGCGG CTCAGCTCCT CACCCCAGCC CAGCCCAGCC 
10051 CAGCCCAGCC CATTGTCTCT TCTTCAAGGG AACTGAAATA CCGGGCCGGG 
10101 CATCAGGCGG ATGAGCCCAG CTTCCAGATC CGAGACTATG TAGAGTCCCA 
10151 GAAGAAGCGC TCCAGCTGCT GCTCCTTCAT GTGAATCCCA GGGGGCAGAG 
10201 AGGAGGCTCT GGAGGCACAC AGGATGCAGC CTTCCCCCTC CCAGGCCTGG 
10251 CTTATTCCAA GAGGCTGAGC CAATGGGGAG AAAGATGGAG GACTCACTGC 
10301 ACAGCCGCTT CCTAGCAGGG AGCTATACTC CAACTCCTAC TTGAGTTCCT 
10351 GCGGTCTCCC CGCATCCACA GGGAGGGTAA AACACTTAGC TTTTATTTTA 
10401 ATAGTACATA ATTTAATACC AAAAAAGGCG CCTGGATCCC CAAAAAACCG 
10451 AGGCTGGGAG CTAGTGGCCC TTTTGCTTTC TAGGACTTGG GGGGCCGGCC 
10501 CTCCCTCCTA AGCATAACAA AGGTGGTGTT GCTCCAGCTC AGCCCCAGGG 
10551 GACACAGATG CACTTTGGGG GTGAGGGCAG GTAATGACTC CATCGCACCC 
10601 TCAGTTCAGC TGGACAGAGG CTCAGGTGAC CCCAGCCTTC ACTGTCTCCC 
10651 GCTCTCCAGG AGCTTATCTT CGCCCCATCT CCCAAATAAG TGGGCCCTTG 
10701 TGCTGTGAGG AAGACCAAAG CCTCAGGGAA GATAAGAGAT ATGGAGATGG 
10751 GAGGGGGAGG ACAAGGGGCA GAGAGTAGGG TCTAGCTGGC TATCTCTGGC 
10801 CTTACTAACA CCCCCCTGGA GGCATGCCCC TTTTCTCCAG CACACAAGCA 
10851 CATTGGGGCA CCTGGAAATA TTGGTTCCAG GCTCCTGTTC TCTGGACTTC 
10901 AGATCCTGGG GGAGCCCCTC CCCCCCCTGA ATCCCTGGCT TAGCTACCTT 
10951 CCTGCCTGTG CACCTAAAAA CCTCAGGTCA GAACTAGGAA AAGAGTTTTG 
11001 TTTTTATTTT TTTGAAATGG AGTCTCGTTC TGTCGCCCAG GCTGAGGTGC 
11051 AGTAGTGCAA TCTCCGCTCA CTACAACCTC CACTCCCTGG GGCTCAAGCG 
11101 ATCCTCCCAC CTCAGCCGCC GAAGTAGCTG GGACTATAGG TGTGTACCAT 
11151 CACACCTGGC TAATTTTTGT ATTTTTTGTA GACACAGGGT TTCGCCATGT 
11201 TGCCCAGGCT GGTCTTGAAT TCCTGAGCTC AAGCAACCTG CCGGCCTCGG 
11251 CCTCCCAAAG TACTGGGATT ACACGCAGAA GGCACCATGC CCAGGCTAGA 
11301 TGTGTCTTAT CCCAATCCTT TGGCAGGCAT GCAGCTCCAC AGGCGATTTC 
11351 TTCAAGCAGC TGAAGTGTTT AGCCCTCCTG GGTTAAGAGC CAGATAAGGA 
11401 GAAATCCCTT TCCTAGGTTT GGAATGTGTT GTGAAAAAAA AGAGAAATCC 
11451 CTGGCTCCTG GAGCTGGTGG GAGACAAGAT TAAGCAAACC TCCCCTGACA 
11501 TGTATCCCTT TGACCCCAAG CTCTGCCTCC TCCCTGACCA CCCATGCCCT 
11551 TTCCTTTAAC TTCTCAAACA GATACCAGGG CCTAAACTGC TTTACCTCCC 
11601 CTCCTACTGA GTCAGGTTAG GTGGTGGGAG GTCACCCATT TCCGAGTTAA 
11651 ACCAATGCAA TATGAGTAAA ACAAAGTCAT GTGGGTATGT CTGGGGTAGA 
11701 GAGAGGGGTA GCAAGTTCAT GTGTCCTCCT TGGTCACATA TCTCCCAAAG 
11751 CTCTGATCCC TGCCATGGGA AGTGGACAGG AAACATGAGG TCATGACCTG 
11801 CAGGCATCTT TACTGCAGCT CTGCCGGCCT GGAGGGGGAG AGGGGGAGGA 
11851 AGAAGTATGC GCTGCACATT TCTGAGGCTA CTGCATTTGC TTTCAAGGCA 
11901 GAAATCTTGC TCTGAGCAGT CAGCGGCTCC AGTTTGGGCC CGATAAGGAA 
11951 GTTCTCCGTG GCCTCCCTCA GGCAGAGCAG GGAGGAGGCT GACATTGCCA 
12001 GTCTCTTCTG GGGCCCAAGG CAGGTTGCAG GAGATCCAAT CCCATAGACA 
12051 GCTCTGGGCC TCTTGCATTT GAGTTTTTCA GAATTAAACT GCAGTATTTT 
12101 GGAAAGCACA TCCTGTCCAC TGTTTCTTTG AAGTGAGTGG GGGGGGGGGG 
12151 TCTTGTTGAA GGAATTGTCA TTCACTGCCA AAATCATTCC ATCCTCCTTC 
12201 CTCAGTGTCT GTCCTCAGAT GGTCAGCTCC CCGCTCAACA GACTGTCTCC 
12251 CGCCTCTGTG ACCAGCCTCT CTTTGGCAAG AGGGAGCTAG AAGGCTTTAC 
12301 AGTCCTAATC ATTTTTCTGT TGGAAAAAAA AAAAAAAAAC CAAGGCTCCT 
12351 TTCCCTGTGG CGTGTACCCA GAGGTTGATT ACCTGAGTCT GTCCTGCCTC 
12401 TCCCCACCCC ACCTCCCTAG CCAAACGCTG CTGCCAAAGC CCACGCTATT 

12451 GCCCTAGATG GCCTGTCTTC AGCGGGCTGC CCCTCGAGGT CCCAGGCTCT 

12501 CCGCGGAGCC CTCACCTTCC CAGCAGGGAT CAGAACCTGC ACTCCTCTAT 

12551 GCGAGTCCTG GGACAGCACA AAGTGGATTA GGGTTAGGGT TCCCACAAAC 

12601 GGAAAAATGT TATTCAAACA ACTCTGTAGG GTCCGAGGAG GCCCTCCGTC 

12651 TTAATTCTCG AGACTGACCG GCCCTCGCTG CCCCGAGCGG GAGCAGTTGC 

12701 CCCGGCAACA GCCGCTCCCT CTCAACTGGA GCTGCACCCA GGCTTTGGCT 

12751 AAAGGCTGTT AAAACGTTGG CCAGGTGCGG AGGCTCACGT CTGTAATCCC 

12801 AGGGCGGATC ACCTGAGGTC AGGAGTTTGA AACCATCCTG GCCAACATGG 

12851 CGAAATTTCG TCTCTACTAA AAATACAAAA ATTAGCGGGG CGTGGTGGTG 

12901 CGCGCCTGTA ACCCCAGCTG CTCGGGAGGC TGAGGCAGGG GAATCGCTTG 

12951 AACCCGGGAG GCGGAGGTTG CAGTGATCCG AGATCGCGCC ACGGCAGTCC 

13001 AGCCTGGGCG ACAGAGCGAG ACTCCGTCTC AAAAAAAAAA AAAAAAGTTA 

13051 GGGTCCTTTA CCCGAGGGCC GGCTTTCCTC ACTCCCCGCC ACAGGTAGGG 

13101 GAAACCAGGC CGGAGCCGGC GGGCCCACCC GCCCAGAACC GGGAATTCGG 

13151 CGAGCCCCGC CCCTGCCACC CCAGCGCCGG CC (SEQ ID NO:3) 
Context: 

DNA Position: 
4259 (SEQ ID NO:16) 
4325 (SEQ ID NO:17) 
4348 
4924 (SEQ ID NO:19) 
4983 




ACCCATTAAGCCACCTAACCAGCAGCTGGGAAATTCCAGCATTGGATCTAGACCCCTGTT 
ATCCAAGATTGGAGAACAGTGGGACAAAGTGCTCCTCTCCACCATTCCTGCGTGTCCCTG 
GGGAAGATGAGCAGAGCAGAGCCAGACAGTAAAGGAGAGGGCCACGCCCCCTCCACAGGT 
TACCTCCTTGGTACTCCTGCCCGCACTACCCACAGCAACCCCGGGATGCCGATCTGCAGC 
CACATGTCCCATGTGGGAGGTTTCTGCTGAAAGAACTTCCAACTACACATCTCCCCACTT 
[C,T] 

ACAGAGTGTCGCTCTGTCACCGAGGCTGGAGTTCAGTGATGCAATCTCGGCTCACTGCAA 
CCTCTACCTCCTGGGTTCAAGCTATTCTCCTGTCTCCGCCTCCCAAGTAACTGGGACTAC 
AGGCGTGTGCCACCACTCCTGGCTAGTTTTTTGTATTTTTAGTAGAGATGGGGTTTCACC 
TTGTTGGTCAGGCTGGTCTCAAACTCCCAACTCAGGTGATCCGTCCACTTGGGCACCCAA ( SEQ ID 



GATTGGAGAACAGTGGGACAAAGTGCTCCTCTCCACCATTCCTGCGTGTCCCTGGGGAAG 
ATGAGCAGAGCAGAGCCAGACAGTAAAGGAGAGGGCCACGCCCCCTCCACAGGTTACCTC 
CTTGGTACTCCTGCCCGCACTACCCACAGCAACCCCGGGATGCCGATCTGCAGCCACATG 
TCCCATGTGGGAGGTTTCTGCTGAAAGAACTTCCAACTACACATCTCCCCACTTCAGTAT 



TGTCGCTCTGTCACCGAGGCTGGAGTTCAGTGATGCAATCTCGGCTCACTGCAACCTCTA 
CCTCCTGGGTTCAAGCTATTCTCCTGTCTCCGCCTCCCAAGTAACTGGGACTACAGGCGT 
GTGCCACCACTCCTGGCTAGTTTTTTGTATTTTTAGTAGAGATGGGGTTTCACCTTGTTG 
GTCAGGCTGGTCTCAAACTCCCAACTCAGGTGATCCGTCCACTTGGGCACCCAAAATG ( SEQ ID 



TGCTCCTCTCCACCATTCCTGCGTGTCCCTGGGGAAGATGAGCAGAGCAGAGCCAGACAG 
TAAAGGAGAGGGCCACGCCCCCTCCACAGGTTACCTCCTTGGTACTCCTGCCCGCACTAC 
CCACAGCAACCCCGGGATGCCGATCTGCAGCCACATGTCCCATGTGGGAGGTTTCTGCTG 
AAAGAACTTCCAACTACACATCTCCCCACTTCAGTATAAATTTCAACCTTCCCTAATTCA 
TGCAACCTTTTTTTTTTTTTTTTTTTTTTGAGACAGAGTGTCGCTCTGTCACCGAGGCTG 
[G, A] 

AGTTCAGTGATGCAATCTCGGCTCACTGCAACCTCTACCTCCTGGGTTCAAGCTATTCTC 
CTGTCTCCGCCTCCCAAGTAACTGGGACTACAGGCGTGTGCCACCACTCCTGGCTAGTTT 
TTTGTATTTTTAGTAGAGATGGGGTTTCACCTTGTTGGTCAGGCTGGTCTCAAACTCCCA 
ACTCAGGTGATCCGTCCACTTGGGCACCCAAAATG (SEQ ID NO: 18) 

TTCAAGTACCAGCCTGGCCAACATGGTAGAAACCCCGTCTCTACTAAAAATAAAAAATTA 
GCCAGGCGAGGTGGTGCATGCCTATAATCCCAGCTACTCAGGTAGGCTGAGGCAGGAGAA 
TCATTTAAACCTGGGAGGTGGAGGTTGTGGTGAGCCAAGATCTCGCCATTGCACTCCAGC 
CTGGGCAACAAGAGCAAAACTCC 



TCTCAAAAAAAAAAAGAAAGAAAGAAAGAAAGAAACTTCCAAATAAATGTTGTGACACAA 
AAAAAAAAACCCAAACAATATTCATTATAGAGTATGCAAATGACCATGCCCCACCCCCAG 
CAGATTCTGATAGACTCCCTTGGGTGGGAATCCTTGTCCAATATATTGACACTTCCCTTT 
CCTGTCAGTATAGCCCAGCCCATGCGTGTACTCACGAGCGGACGATGGATGACACAAGTA 
CACAGAGGGACGGAATCCCTGCATGGTGTGGCTATGGGCAAATGTGGCCACTGTCTAGAT (SEQ ID 



TTCAAGTACCAGCCTGGCCAACATGGTAGAAACCCCGTCTCTACTAAAAATAAAAAATTA 
GCCAGGCGAGGTGGTGCATGCCTATAATCCCAGCTACTCAGGTAGGCTGAGGCAGGAGAA 
TCATTTAAACCTGGGAGGTGGAGGTTGTGGTGAGCCAAGATCTCGCCATTGCACTCCAGC 
CTGGGCAACAAGAGCAAAACTCCGTCTCAAAAAAAAAAAGAAAGAAAGAAAGAAAGAAAC 
TTCCAAATAAATGTTGTGACAC 
[-,A] 

AAAAAAAAAACCCAAACAATATTCATTATAGAGTATGCAAATGACCATGCCCCACCCCCA 
GCAGATTCTGATAGACTCCCTTGGGTGGGAATCCTTGTCCAATATATTGACACTTCCCTT 
TCCTGTCAGTATAGCCCAGCCCATGCGTGTACTCACGAGCGGACGATGGATGACACAAGT 
ACACAGAGGGACGGAATCCCTGCATGGTGTGGCTATGGGCAAATGTGGCCACTGTCTAGA 
TTGTGCAAATGTGGTGGTTCTCTGGGGCCACAGAGCACACTTGGGGACCTGTTCATGGTG ( SEQ ID 



CACCAGGGACCCTCCCTCTCACCTTGACGACTCCATCTTACAAATCTGCATCAGGGATGC 
TAGACGCTGCACACCTGAAGTGTTCAATAGAGAAAAGGTCTCACCCTGGCAGGTGGGGCT 
CTACAGCTTCAAGCAGGCAGAAAGCGAACACTTCCTTCACTAGAGAATTAGTGGGCAGCT 
AAAGAAAAGGTGCTGCTGCAGATGTAGCCTCAGGTCCCCAGGATGCAGGCAAACACCCCA 
TCTCCAGGGGCTCGGTCACAGTCCCAAGGCTAGGCTCCAGGAGAGGGAGACCGAAGTGGG 
[A, G] 

AAAGGGCAGGGCCTCCAGCAGCAACCAGCCCTCCAGCCCTGGGCTGCCTGATCCCTGGAG 
AGAGCCAGGATGTTTCTCAGGCTCCTCTTGCCCTGCTGTTGTGAGAAGGCAGTTACAGTC 
CTCAGAAGGGACGACTCCACAGTGGAGGTGTCTGGGTATGGGGTTCCTGCTGCCCTGATG 
GTATGATCTGGCTGGAGACGGTTCTGGGGCTCACTGCACCCACTCTAGGCCTGGAGAGGG 
AACAAGAGAGGACGTCTGCAGAGCTGAGGAGCCACATGACTCCTGCCCTCCCATCCTCTG ( SEQ ID 



GTGCCACCTGCATAGCCCTCACTGTGATTCACGAGTGTGTTTCGTGACAAAGTGTTCAGA 
ACAGCCCCCACTCCACCCTGGATAATTATCCACAGAGACCAAGGGAAAAACACAACCAGA 
AAAGTCCACACATACATCCAGGGCAAGTTGCAAGAAAGTGACTCAGTCAGACAGAGTGAG 
TGGTTGTATCCTCACAACCAAACTATTATAGAGACAAAAATTTGATAAATTCAAGCACCA 
ATTTTGTTCACGACATTGTATAGGTTTCATGAATCCCCTGACCTCAAGGACAGTTTGCTG 
[A, G] 

TAAGCAAACTAGGAGAATAAAACGTTTATATAGAAAGAGGAAAATCCATGGCACTCATAC 
TCCTACCTCCAACCCCATGCTCATGGCAGACATCACTAATCAATCACAGTACTTTTGATC 
ACTGAAACCCTTATGTGGTCTTAGAATCTTTAACAGGACACTCCAAGAAATCACTGCTGA 
CAGCCAACTGATTTGTGAGATAAGGTCTCCATGCATCTGGATCTTCCATAGAACTGATAG 
TTGCACAGCATAAAATGGTGAGGGTGGGGCCATTGTGGGTTGAGCCACCAAGGAAGGCCA ( SEQ ID 



TGTTTCGTGACAAAGTGTTCAGAACAGCCCCCACTCCACCCTGGATAATTATCCACAGAG 
ACCAAGGGAAAAACACAACCAGAAAAGTCCACACATACATCCAGGGCAAGTTGCAAGAAA 
GTGACTCAGTCAGACAGAGTGAGTGGTTGTATCCTCACAACCAAACTATTATAGAGACAA 
AAATTTGATAAATTCAAGCACCAATTTTGTTCACGACATTGTATAGGTTTCATGAATCCC 
CTGACCTCAAGGACAGTTTGCTGATAAGCAAACTAGGAGAATAAAACGTTTATATAGAAA 
[G, A] 

AGGAAAATCCATGGCACTCATACTCCTACCTCCAACCCCATGCTCATGGCAGACATCACT 
AATCAATCACAGTACTTTTGATCACTGAAACCCTTATGTGGTCTTAGAATCTTTAACAGG 
ACACTCCAAGAAATCACTGCTGACAGCCAACTGATTTGTGAGATAAGGTCTCCATGCATC 
TGGATCTTCCATAGAACTGATAGTTGCACAGCATAAAATGGTGAGGGTGGGGCCATTGTG 
GGTTGAGCCACCAAGGAAGGCCATCCAGGCCTGGATGGGCCAGAACAAAGGTACAGATGA (SEQ ID 



GCTCCTGGAGCTGGTGGGAGACAAGATTAAGCAAACCTCCCCTGACATGTATCCCTTTGA 
CCCCAAGCTCTGCCTCCTCCCTGACCACCCATGCCCTTTCCTTTAACTTCTCAAACAGAT 
ACCAGGGCCTAAACTGCTTTACCTCCCCTCCTACTGAGTCAGGTTAGGTGGTGGGAGGTC 
ACCCATTTCCGAGTTAAACCAATGCAATATGAGTAAAACAAAGTCATGTGGGTATGTCTG 
GGGTAGAGAGAGGGGTAGCAAGTTCATGTGTCCTCCTTGGTCACATATCTCCCAAAGCTC 
[T,C] 

GATCCCTGCCATGGGAAGTGGACAGGAAACATGAGGTCATGACCTGCAGGCATCTTTACT 
GCAGCTCTGCCGGCCTGGAGGGGGAGAGGGGGAGGAAGAAGTATGCGCTGCACATTTCTG 
AGGCTACTGCATTTGCTTTCAAGGCAGAAATCTTGCTCTGAGCAGTCAGCGGCTCCAGTT 
TGGGCCCGATAAGGAAGTTCTCCGTGGCCTCCCTCAGGCAGAGCAGGGAGGAGGCTGACA 
TTGCCAGTCTCTTCTGGGGCCCAAGGCAGGTTGCAGGAGATCCAATCCCATAGACAGCTC (SEQ ID 



GACCACCCATGCCCTTTCCTTTAACTTCTCAAACAGATACCAGGGCCTAAACTGCTTTAC 
CTCCCCTCCTACTGAGTCAGGTTAGGTGGTGGGAGGTCACCCATTTCCGAGTTAAACCAA 
TGCAATATGAGTAAAACAAAGTCATGTGGGTATGTCTGGGGTAGAGAGAGGGGTAGCAAG 
TTCATGTGTCCTCCTTGGTCACATATCTCCCAAAGCTCTGATCCCTGCCATGGGAAGTGG 
ACAGGAAACATGAGGTCATGACCTGCAGGCATCTTTACTGCAGCTCTGCCGGCCTGGAGG 
[A, G] 

GGAGAGGGGGAGGAAGAAGTATGCGCTGCACATTTCTGAGGCTACTGCATTTGCTTTCAA 
GGCAGAAATCTTGCTCTGAGCAGTCAGCGGCTCCAGTTTGGGCCCGATAAGGAAGTTCTC 
CGTGGCCTCCCTCAGGCAGAGCAGGGAGGAGGCTGACATTGCCAGTCTCTTCTGGGGCCC 
AAGGCAGGTTGCAGGAGATCCAATCCCATAGACAGCTCTGGGCCTCTTGCATTTGAGTTT 
TTCAGAATTAAACTGCAGTATTTTGGAAAGCACATCCTGTCCACTGTTTCTTTGAAGTGA 



(SEQ ID NO: 25) 